Aural navigation of information rich visual interfaces

ABSTRACT

A method comprising generating, by a computer, a model of a website using user interaction primitives to represent hierarchical and hypertextual structures of the website; generating, by the computer, a linear aural flow of content of the website based upon the model and a set of user constraints; audibly presenting, by the computer, the linear aural flow of the content such that the linear aural flow of content is controlled through the use of user supplied primitives, wherein, the linear aural flow can be turned into a dynamic aural flow based upon the user supplied primitives.

This patent application claims priority to copending U.S. provisional application No. 61/699,748, filed on Sep. 11, 2012 and incorporates the same herein by reference.

BACKGROUND

This specification relates to navigation of information and content rich interfaces and applications and specifically the navigation of web based interfaces and applications. Accessing the mobile web on-the-go and in a variety of contexts (e.g., walking, standing, jogging, or driving) is becoming more and more pervasive. Mobile users are often engaged in another activity when it is inconvenient, distracting or even dangerous to look continuously at the web display device. Although existing visual user interfaces can be efficient to support quick scanning of a page, they typically require highly focused attention and may not work well or may require a dangerous level of attention in certain situations. It is known that the use of audio-based interfaces of mobile and non-mobile devices during secondary tasks is less distracting and demanding when compared to visual interfaces.

Another concern is the degree of required or desired interactivity with the web application. Continuous or visually detailed interaction with a conventional web interface requires the user to expend visual attention on the web interface. For example, suppose a user is walking on a city street and would like to catch up with the weekly local news during his 10-minute walk to work. Continuous interaction with a conventional news site on a smart phone would force the user to scan the homepage, ascertain the latest news, select a category, potentially select a subcategory, and then finally select a news story to read. Once the story is read, the user may want to know more about it or select another news story in the same category, etc. Much of this interactivity is in conflict with the current task of the user's walk to work. Furthermore, the effort expended to both walk and visually interact with the web interface likely amounts to an undesirable user experience. Thus, there is a need for an audio-based system of interaction with data rich interfaces. The present invention addresses this need.

SUMMARY

This specification describes technologies relating to audio based web navigation and audio web content presentation.

In general, one innovative aspect of the subject matter described in this specification can be embodied in methods that include the actions of generating a model derived from the analysis of user interactions that represents the hierarchical and hypertextual structures of a website and using that model and user supplied constraints to generate a linear aural flow of content from the website. An audible presentation based on the linear aural flow is then presented to the user with options for the user to dynamically direct and alter the content of the audio presentation.

Other embodiments of this aspect include corresponding systems, apparatus, and computer programs, configured to perform the actions of the methods, encoded on computer storage devices.

The details of one or more embodiments of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an example environment in which a paradigm for implementing aural navigation flows on rich architectures manages content delivery services.

FIG. 2 is an example web page such as might be navigated by an aural navigation system.

FIG. 3 is a block diagram of an aural navigation system's linear full flow of a collection of web pages.

FIG. 4 is a block diagram of an aural navigation system's user defined flow of a collection of web pages.

FIG. 5 is a sample block diagram of a group aural flow in a simplified example web architecture.

FIG. 6 is a representation of a sample user interface for a mobile device that supports aural navigation flows.

FIG. 7 is a representation of an accelerometer-based shake gesture to interact with an aural flow.

FIG. 8 is a block diagram of a personal computing device capable of implementing a portion or all of the described technology.

Like reference numbers and designations in the various drawings indicate like elements.

DETAILED DESCRIPTION

Before the present methods, implementations and systems are disclosed and described, it is to be understood that this invention is not limited to specific synthetic methods, specific components, implementation, or to particular compositions, and as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular implementations only and is not intended to be limiting.

As used in the specification and the claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise. Ranges may be expressed in ways including from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, another implementation may include from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, for example by use of the antecedent “about,” it will be understood that the particular value forms another implementation. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint.

“Optional” or “optionally” means that the subsequently described event or circumstance may or may not occur, and that the description includes instances where said event or circumstance occurs and instances where it does not. Similarly, “typical” or “typically” means that the subsequently described event or circumstance often, though not always, occurs, and that the description includes instances where said event or circumstance occurs and instances where it does not.

This application describes a novel, semi-interactive aural paradigm for implementing aural navigation flows on rich architectures, enabling users to listen to information-rich interfaces, such as web pages, utilizing complex, hypertextual structures while interacting with the interfaces infrequently. Further, this technology provides for the “aural flow” and investigates new ways in which different types of aural flow can be applied to conventional information rich architectures such as web pages. An aural flow is a design-driven, concatenated sequence of pages that can be listened to with minimal interaction required. A flow is governed by aural design rules that determine which pages of the information architecture to automatically concatenate and at which point of the flow the user can interact.

This technology additionally provides the ability to quickly scan through content-rich data interfaces, such as web pages, allowing effective scanning under time, contextual, and/or physical constraints. Finally, the described technology provides a generic design framework applicable to any non-linear, content-rich architecture, such as that which underlies modern web systems. For example, the described technology is appropriate for any large website that features hierarchical and hypertextual structures, such as a commerce, travel planning, or tourism site, and the like.

FIG. 1 is a block diagram of an example environment 100 in which a paradigm for implementing aural navigation flows on rich architectures manages content delivery services. The example environment 100 includes a network 102, such as a local area network (LAN), a wide area network (WAN), the Internet, or a combination thereof. The network 102 connects websites 104, user devices 106 (also known as personal computing devices), content sponsors (e.g., advertisers 108), an advertisement management system 110, and an aural navigation system 120. The example environment 100 may include many thousands of websites 104, user devices 106 and advertisers 108.

A website 104 is one or more resources 105 associated with a domain name and hosted by one or more servers. An example website is a collection of web pages formatted in the hypertext markup language (HTML) that can contain text, images, multimedia content and programming elements, such as scripts. Each website 104 is maintained by a publisher/sponsor, which is an entity that controls, manages and/or owns the website 104.

A resource 105 is any data that can be provided over the network 102. A resource 105 is identified by a resource address that is associated with the resource 105. Resources include HTML pages, word processing documents, portable document format (PDF) documents, images, video, and feed sources, to name a few. The resources can include content, such as words, phrases, images and sounds, that may include embedded information (such as meta-information in hyperlinks) and/or embedded instructions (such as JavaScript scripts). Units of content that are presented in (or with) resources are referred to as content items.

A user device 106 is an electronic device that is under control of a user and is capable of requesting and receiving resources over the network 102. Example user devices 106 include personal computers, mobile communication devices, and other devices that can send and receive data over the network 102. A user device 106 typically includes a user application, such as a web browser, to facilitate the sending and receiving of data over the network 102.

A user device 106 can request resources 105 from a website 104. In turn, data representing the resource 105 can be provided to the user device 106 for presentation by the user device 106. The data representing the resource 105 can also include data specifying a portion of the resource or a portion of a user display (e.g., a presentation location of a pop-up window or in a slot of a web page) in which advertisements can be presented. These specified portions of the resource or user display are referred to as slots or advertisement slots.

To facilitate searching of these resources 105, the environment 100 can include a search system 112 that identifies the resources 105 by crawling and indexing the resources 105 provided by the publishers on the websites 104. Data about the resources can be indexed based on the resource 105 to which the data corresponds. The indexed and, optionally, cached copies of the resources 105 are stored in a search index 114.

User devices 106 can submit search queries 116 to the search system 112 over the network 102. In response, the search system 112 accesses the search index 114 to identify resources that are relevant to the search query 116. The search system 112 identifies the resources in the form of search results 118 and returns the search results 118 to the user devices 106 in search results pages. A search result 118 is data generated by the search system 112 that identifies a resource that is responsive to a particular search query, and includes a link to the resource. An example search result 118 can include a web page title, a snippet of text or a portion of an image extracted from the web page, and the URL of the web page. Search results pages can also include one or more slots in which other content or advertisements can be presented.

When a resource 105 or search results 118 are requested by a user device 106, the advertisement management system 110 receives a request for advertisements to be provided with the resource 105 or search results 118. The request for advertisements can include characteristics of the slots that are defined for the requested resource or search results page.

For example, a reference (e.g., URL) to the resource for which the slot is defined, a size of the slot, and/or media types that are eligible for presentation in the slot can be provided to the advertisement management system 110. Similarly, keywords associated with a requested resource (“resource keywords”) or a search query 116 for which search results are requested can also be provided to the advertisement management system 110 to facilitate identification of advertisements that are relevant to the resource or search query 116.

Based on data included in a given request, the advertisement management system 110 selects advertisements or other content that is eligible to be provided in response to the request (e.g., eligible advertisements). For example, eligible advertisements can include advertisements having characteristics matching those of slots and that are identified as relevant to specified resource keywords or search queries 116. In some implementations, advertisements that have targeting keywords that match the resource keywords or the search query 116 are selected as eligible advertisements by the advertisement management system 110.

A targeting keyword can match a resource keyword or a search query 116 by having the same textual content (“text”) as the resource keyword or search query 116. The relevance can be based, for example, on root stemming, semantic matching, and topic matching. For instance, an advertisement associated with the targeting keyword “hockey” can be an eligible advertisement for an advertisement request including the resource keyword “hockey.” Similarly, the advertisement can be selected as an eligible advertisement for an advertisement request including the search query “hockey.”

A targeting keyword can also match a resource keyword or a search query 116 by having text that is identified as being relevant to the resource keyword or search query 116 despite having different text than the resource keyword or search query 116. For example, an advertisement having the targeting keyword “hockey” may also be selected as an eligible advertisement for an advertisement request including a resource keyword or search query for “sports” because hockey is a type of sport, and therefore, is likely to be relevant to the term “sports.”

The aural navigation system 120 in some implementations provides a generic design framework applicable to any non-linear, content-rich architecture that is depicted in this example environment 100. The aural navigation system 120 provides for aural flows that are modeled on top of existing web information and navigation architectures and can co-exist with the traditional navigation and search mechanisms such as depicted in this example environment 100. In some implementations, the aural navigation system 120 takes the existing structures and linearizes them appropriately for the aural experience, eliminating the need for changes to the existing websites. For example, the aural navigation system 120 can analyze an existing website 104 such as a news website, and linearize the website for audio presentation such that only simple commands are needed by the user to navigate the audio presentation of the content of the news website.

In some implementations, the aural navigation system 120 can also utilize user directives, past user browsing and audio browsing history, user stated preferences, and other user information such as user location, user online socio presence, and user schedule when linearizing a website for audio presentation. User directives can be thought of as user supplied defaults. For example, for sites that employ popularity ordering of articles, the user can add defaults to instruct the aural navigation system 120 to ignore articles below a certain ranking. As another example, the aural navigation system 120 can analyze a user's past browsing history to determine that the user typically doesn't review sports articles. Using such information, the aural navigation system 120 could omit the sports content of a news website 104 when linearizing its content for audio presentation to that user. However, the aural navigation system 120 could override the user's past browsing habits upon encountering sports content that has a significant socio connection with the user. One example of a significant socio connection with the user is sports content referencing a friend of the user.
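The selection logic described in the preceding paragraph can be sketched as follows. This is a minimal illustration only, under stated assumptions: the `Article` fields, the `select_for_flow` function, and the representation of the socio connection as a set of friend names are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    title: str
    category: str
    rank: int  # assumed convention: 1 is the most popular article
    mentions: set = field(default_factory=set)  # people referenced in the article


def select_for_flow(articles, max_rank, skipped_categories, friends):
    """Return the articles to linearize, in popularity order (hypothetical helper)."""
    selected = []
    for article in sorted(articles, key=lambda a: a.rank):
        if article.rank > max_rank:
            continue  # user directive: ignore articles below a certain ranking
        has_socio_connection = bool(article.mentions & friends)
        if article.category in skipped_categories and not has_socio_connection:
            continue  # browsing history suggests the user skips this category
        selected.append(article)
    return selected


articles = [
    Article("Budget vote", "politics", 1),
    Article("Local team wins", "sports", 2, {"Dana"}),
    Article("Transfer rumors", "sports", 3),
    Article("Minor story", "world", 9),
]
chosen = select_for_flow(articles, max_rank=5,
                         skipped_categories={"sports"}, friends={"Dana"})
print([a.title for a in chosen])
```

In this sketch the socio connection overrides only the history-derived category filter, not the explicit ranking directive, which is one possible reading of the behavior described above.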

In some implementations, the aural navigation system 120 is able to perceive and respond to user input (oral or otherwise) and such user input is interpreted within the context of the user's session and the user's history. Example commands can include “Change to”, “Switch to”, “Back”, or “Previous”, which are sensitive to the user's flow history, not a default flow. Most implementations include various forms of bookmarking enabling the continuing of a story or a topic from a previous session. In some implementations, multiple bookmarks can be maintained, enabling the user to go back and continue any of several paused stories. In some implementations, the aural navigation system 120 implements a time-based relevance decay under which past bookmarked articles eventually lose their bookmark if not referenced after a period of time.
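One way to picture multiple bookmarks with time-based relevance decay is the sketch below. It assumes a simple hard cutoff (a bookmark is dropped once it has gone unreferenced for longer than a configured age); the specification does not prescribe a decay function, and the `BookmarkStore` class and its method names are hypothetical.

```python
import time


class BookmarkStore:
    """Keeps one resume position per story and forgets stale bookmarks."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        self._bookmarks = {}  # story id -> (position, time last referenced)

    def save(self, story_id, position, now=None):
        now = time.time() if now is None else now
        self._bookmarks[story_id] = (position, now)

    def expire(self, now=None):
        """Drop bookmarks not referenced within max_age_seconds."""
        now = time.time() if now is None else now
        self._bookmarks = {
            story_id: (position, stamp)
            for story_id, (position, stamp) in self._bookmarks.items()
            if now - stamp <= self.max_age_seconds
        }

    def resume(self, story_id, now=None):
        """Return the saved position, or None if the bookmark has decayed."""
        now = time.time() if now is None else now
        self.expire(now)
        entry = self._bookmarks.get(story_id)
        if entry is None:
            return None
        position, _ = entry
        self._bookmarks[story_id] = (position, now)  # referencing refreshes it
        return position
```

The optional `now` argument exists only so the decay can be exercised deterministically; a real implementation would likely rely on the system clock.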

Other sample commands include but are not limited to: “What's new?”, “Anything else (like this)?”, “Next” or “Skip”, “Stop” or “Pause”, “Resume”, “Continue” or “Play”, “Listen to”, “Go to”, “Switch to” or “Change to”, “More” or “Tell me more”, and “Restart” or “Start over”. Note that in some implementations, the aural navigation system 120 is implemented by a user device 106.

FIG. 2 is an example web page 200 such as might be navigated by an aural navigation system 120. The example web page 200 is the resource 105. The example web page 200 includes a title 205, a search text slot 210, a search button 215, a search results container 235 and advertisement slots 230a-230c. The search results container 235 contains the search results 118 of a search performed on this resource. In some implementations, the aural navigation system 120 would provide a “linear flow” of the content of web page 200, contemplating pre-designated page exits, while other implementations provide a “user defined flow” enabling user designated exits and/or content expansion.

FIG. 3 is a block diagram 300 of an aural navigation system's 120 linear full flow 310 of a collection of web pages 104. In some implementations of a linear full flow 310, the flow of information is strictly linear. Users are able to leave the flow 320 for related stories 330; upon finishing related stories, they are returned 340 to the original flow. Otherwise, they are only able to jump forward and backward. The flow begins with the first story 350 in the first group of topics 360. Headline, summary and full story are read in that order. Upon finishing the first story, the system will move on to the next story in that group of topics 360. Upon finishing the last story in a group of topics 360, the system will move on to the next group of topics 370.

In the block diagram 300, the lines 380 show the default flow. The lines 320, 340, and 385 represent where users can interrupt the flow and move to different parts. The system begins with an orientation cue letting the user know which group of topics they are listening to and the position of the current story in the flow (e.g. “World News, Story 1 of 3”). As shown, each story contains a headline, summary, full story and optional related stories.
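The default ordering just described (groups in sequence, stories in sequence within a group, and headline, summary and full story in that order) can be sketched as a simple generator. The sketch assumes that an orientation cue is emitted before every story, and the `linear_full_flow` name and the dictionary keys are hypothetical.

```python
def linear_full_flow(groups):
    """Yield (kind, text) pairs in default flow order.

    groups: list of (group_name, stories) in site order, where each story is
    a dict with "headline", "summary" and "full_story" keys (assumed layout).
    """
    for group_name, stories in groups:
        total = len(stories)
        for index, story in enumerate(stories, start=1):
            # Orientation cue: current group and position of the story in it.
            yield ("cue", f"{group_name}, Story {index} of {total}")
            for part in ("headline", "summary", "full_story"):
                yield (part, story[part])


groups = [
    ("World News", [
        {"headline": "H1", "summary": "S1", "full_story": "F1"},
        {"headline": "H2", "summary": "S2", "full_story": "F2"},
    ]),
    ("U.S. News", [
        {"headline": "H3", "summary": "S3", "full_story": "F3"},
    ]),
]
for kind, text in linear_full_flow(groups):
    print(kind, text)
```

Departures to related stories and the return to the original flow are deliberately left out of this sketch.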

In some implementations, the aural navigation system 120 can review the user's browsing history, audio browsing history, location, device 106 usage, socio presence, and calendar when generating a linear full flow of a collection of web pages. For example, a user's browsing history may demonstrate a preference for only the top ranked stories from a particular website 104. As such, the aural navigation system 120 can anticipate the user's continued browsing pattern by generating a linear flow of the webpages corresponding to the user's anticipated preferences. As another example, a user's browsing pattern could be based upon his location. For example, the content that the user wishes to review in the car can be vastly different than the content that the user wants to review when at work.

FIG. 4 is a block diagram 400 of an aural navigation system's 120 user defined flow 410 of a collection of web pages. In some implementations, the aural navigation system 120 pauses after reading, that is, audibly disclosing, each dialogue (e.g. summary, full story, reader comments), allowing a user to speak a command. Users can interrupt this flow at any time with any command from the vocabulary. In some implementations, users are able to speak the name of a group of topics (e.g. Politics, U.S., World) and begin the flow in that group. As such, in some implementations each category of content from the website being accessed is available as a command. Categories act as keywords, allowing users the freedom to define their own navigation strategy.

In this example 400, line 420 indicates a scenario in which a user leaves the flow to listen to related stories and then changes categories during the flow. Each story contains a headline, summary, full story, reader comments, and two related stories. Users are free to navigate the topics as they please.

FIG. 5 is a sample block diagram 500 of a group aural flow 530 in a simplified example web architecture. Even in this simplified example, the typical non-linear nature of such information sources is clearly visible. For example, the example contains different organizational structures (e.g., hierarchical and hypertextual).

In some implementations, the features of the architecture along with the hypertextual connections are modeled through a collection of primitives and notions known in the art as Interactive Dialogue Module (IDM). IDM provides basic concepts to describe and model hypertextual non-linear architectures. IDM is based on the notion that user interaction can be considered a dialogue between the user and the system. In a nutshell, core content entities (e.g., the news) are multiple topics. A multiple topic can be structured in dialogue acts (news story, commentary on the news story) corresponding to different pages or interaction units composing the topic. Multiple topics are typically organized in groups of topics (e.g., U.S. news or world news) at different hierarchical levels. Hypertextual or semantic associations are typed and can be characterized as structural relationships between multiple topics.
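The concepts named above (dialogue acts, topics, groups of topics at different hierarchical levels, and typed associations between topics) could be represented with a small set of data structures. The following is an illustrative sketch only; the class and field names are hypothetical and are not part of any published IDM notation.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueAct:
    name: str  # e.g. "news story" or "commentary"
    text: str


@dataclass
class Topic:
    title: str
    acts: list = field(default_factory=list)     # DialogueAct objects, in order
    related: list = field(default_factory=list)  # typed (relation, Topic) pairs


@dataclass
class TopicGroup:
    name: str
    topics: list = field(default_factory=list)
    subgroups: list = field(default_factory=list)  # lower hierarchical levels


def iter_topics(group):
    """Walk a group depth-first and yield every topic beneath it."""
    yield from group.topics
    for subgroup in group.subgroups:
        yield from iter_topics(subgroup)


story = Topic("Budget vote", [DialogueAct("news story", "..."),
                              DialogueAct("commentary", "...")])
follow_up = Topic("Tax reaction")
story.related.append(("related news", follow_up))  # typed association
news = TopicGroup("News", subgroups=[TopicGroup("U.S. news", [story]),
                                     TopicGroup("World news", [follow_up])])
print([topic.title for topic in iter_topics(news)])
```

A depth-first walk such as `iter_topics` is one simple way to obtain a linear ordering from the hierarchy; the typed associations remain available for departures from that ordering.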

Using IDM, one or more aural flows are modeled on top of existing web information and navigation architectures as represented by IDM. Thus, the aural flows can co-exist with the traditional navigation and interaction paradigm. As a more complete explanation, an aural flow can be thought of as a design-driven, concatenated sequence of web pages that can be listened to with minimal interaction. The flow is governed by aural design rules that determine which pages of the information architecture to automatically concatenate and at which point of the flow the user can interact. Such design rules can be proposed and refined through various machine learning and statistical techniques. For example, concatenation rules can be derived from topic popularity as determined by related topic page hits. As another example, statistical models can be derived from topic popularity measures and web activity measures. Similar to predicting conversion for a sales event, the popularity measures and activity measures can be used to derive a “conversion-like” predictor capable of providing a predictive expectation value for topic popularity.

In some implementations, the user is presented with two flow patterns. The user may either follow the Default Full Flow with little to no interaction, or they may navigate where they please within the flow, creating their own User-Defined Flow. The Default Full Flow, unless interrupted by the user, follows a linear, concatenated flow of information. Typical implementations provide the headline and summary of articles and then provide a portion of the content based upon the aural flow rules. For example, the aural flow rules could provide the full content along with the commentary. The flow continues for each content item or story deemed to be above a certain threshold of interest or automatically included by a default behavior. Upon finishing a content item, the system will move on to the next content item in that group of topics. Upon finishing the last story in a group of topics, the system will move on to the next group of topics. The next group of topics can be based upon the underlying web architecture, derived interest rules (where a topic may have a perceived higher interest than another), or user derived interest rules (such rules can be derived from previous user actions or directly obtained through a user initiation where the user acts provide rules to govern topic interest).

However, a user can interrupt the default flow at any time with a command from the vocabulary (e.g. “stop” or “change to”). They may navigate wherever they please, at any time they want. This freedom of control creates a User-Defined Flow. An important feature of this flow type is that the system will keep track of a user's history and context during each session. For example, saying a command like “Previous” will take them to the last story they heard, not the previous story in the default flow. In some implementations, the User-Defined Flow still follows the order of the Default Full Flow until a user utters a command. A table of example commands and their respective actions is presented below (Table 1). In most implementations, users have at least four basic categories of interaction. The four categories are: a) Pause, resume, replay and stop: The user can pause and resume the flow. The same dialogue act can be replayed from the beginning. The user can also stop the flow to go back to the home page; b) Fast forward/backward browsing: The user can fast forward to go to the next dialogue act of the same topic or fast backward to go back to the previous dialogue act of the same topic; c) Jump forward/backward browsing: The user can jump forward to the next topic or jump backward to the previous one at any time; d) Navigating out of the flow: The user might want to listen to a related topic by clicking on its link. This action breaks the current flow and moves outside the flow to the desired content (e.g., Related News).
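The distinction between the user's flow history and the default flow can be sketched as follows. This is a minimal sketch, assuming stories are identified by simple labels; the `FlowSession` class and its method names are hypothetical.

```python
class FlowSession:
    """Tracks what the user actually heard, separately from the default order."""

    def __init__(self, default_order):
        self.default_order = list(default_order)
        self.history = [self.default_order[0]]  # the flow starts at the first story

    @property
    def current(self):
        return self.history[-1]

    def next_story(self):
        # Continue from the most recent story that belongs to the default flow.
        anchor = next(s for s in reversed(self.history) if s in self.default_order)
        position = self.default_order.index(anchor)
        if position + 1 < len(self.default_order):
            self.history.append(self.default_order[position + 1])
        return self.current

    def go_to(self, story):
        # "Switch to" or a related-story link: jump anywhere.
        self.history.append(story)
        return self.current

    def previous(self):
        # "Previous" follows the user's history, not the default order.
        if len(self.history) > 1:
            self.history.pop()
        return self.current


session = FlowSession(["A", "B", "C", "D"])
session.next_story()       # default flow moves from A to B
session.go_to("D")         # user jumps ahead to D
print(session.previous())  # returns to B, not to C
```

Here “Previous” after the jump returns to the last story heard (B) rather than to the story that precedes D in the default order (C), matching the behavior described above.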

Note that some implementations provide for a preliminary input or presentation guiding input from a user. Table 2 provides an example of the different characteristics of aural flow types. This preliminary input enables the system to tailor the aural flow to the user's current expectations and/or limitations. For example, in some implementations the user can tell the system the amount of time that the user has available for listening. For instance, the user can tell the system that he has 20 minutes, in which case the aural flow through possible content will be streamlined. As another example, a user, after choosing a main group of topics, such as U.S. news, could listen to all of the headlines or story summaries in that category. Users would be able to navigate through all the news stories in one category and continue the flow with the next category of news or related stories.
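One possible way to streamline a flow to a stated listening time is sketched below. The specification does not say how the streamlining is performed; this sketch assumes a greedy pass that plays a story in full when it fits the remaining time and otherwise falls back to the summary alone. The speaking rate, the `fit_to_budget` function, and the dictionary keys are all assumptions.

```python
WORDS_PER_MINUTE = 150  # assumed text-to-speech speaking rate


def spoken_minutes(text):
    """Estimate how long a passage takes to read aloud."""
    return len(text.split()) / WORDS_PER_MINUTE


def fit_to_budget(stories, budget_minutes):
    """Return (headline, coverage) pairs that fit the listening time."""
    plan = []
    remaining = budget_minutes
    for story in stories:
        summary_time = spoken_minutes(story["summary"])
        full_time = summary_time + spoken_minutes(story["full_story"])
        if full_time <= remaining:
            plan.append((story["headline"], "full"))
            remaining -= full_time
        elif summary_time <= remaining:
            plan.append((story["headline"], "summary"))
            remaining -= summary_time
    return plan
```

For instance, with a 20 minute budget and stories that each take about ten minutes in full, such a pass would play the first two stories in full and skip or summarize the rest.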

It has been observed that two sources of error account for a sizable portion of the errors between user and aural flow interaction. The two sources of errors are speech recognition errors and navigation errors. Recognition errors occur when the system either does not understand the uttered command, or the uttered command is not within the scope of the command vocabulary. In some implementations, recognition errors are handled through the notification of the user by the system emitting an earcon, a distinct and noticeable sound. After being so notified, the user can then reissue the command or issue a different command.

It is worth mentioning that some implementations provide for a hybrid interface consisting of both the audio presentation along with a visual interface dynamically cued to the content of the current flow. Such hybrid interfaces enable the “At-a-glance” visual confirmation of content. Additionally, such implementations provide for a more extensive visual coverage of the current topic. Such implementations typically provide an interactive mechanism, for example, a swipe on the personal computing device's touch sensitive screen, to visually provide the full coverage of the topic currently being disclosed.

Navigation errors occur when a user utters a command that is not applicable in the current part of the flow (e.g. saying “Forward” while in the commentary). These errors should be handled with audio orientation cues provided by the system (e.g. “There is no more content for this story”). In some implementations, the system responds by reverting to a default flow. Alternatively, some implementations respond by audibly providing a shortened menu of possible actions based upon the user's current location in the flow.
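The two error classes can be told apart with a check against the vocabulary followed by a check against the current position in the flow. The sketch below is illustrative only: the reduced vocabulary, the section list, the `handle_command` function, and the wording of the cue for rewinding past the start are assumptions.

```python
VOCABULARY = {"forward", "rewind", "next", "previous", "pause", "resume"}
SECTIONS = ["headline", "summary", "full story", "commentary"]


def handle_command(utterance, section):
    """Return (system response, section the flow is in afterwards)."""
    command = utterance.strip().lower()
    if command not in VOCABULARY:
        # Recognition error: notify the user with an earcon, stay in place.
        return "earcon", section
    position = SECTIONS.index(section)
    if command == "forward":
        if position == len(SECTIONS) - 1:
            # Navigation error: valid command, not applicable here.
            return "There is no more content for this story", section
        return "ok", SECTIONS[position + 1]
    if command == "rewind":
        if position == 0:
            # Hypothetical orientation cue for the opposite boundary.
            return "This is the beginning of the story", section
        return "ok", SECTIONS[position - 1]
    return "ok", section  # other commands are not modeled in this sketch


print(handle_command("Forward", "commentary"))
print(handle_command("sing", "summary"))
```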

FIG. 6 is a representation of a sample user interface 600 for a mobile device that supports aural navigation flows. The aural flow experience consists of two main components, as highlighted in the figure: Selecting the flow 620 and Experiencing the flow 640. In Selecting the flow 620, the system provides to the user several options to choose and customize the coverage of the available content, based on time constraints, types of aural flow and user's interest. A simple sequence of user interface screens is shown supporting this selection task. Once the user has selected values for such simple parameters, the system immediately generates and makes available the aural flow corresponding to the user's selection. At this point, the user enters the Experiencing the flow 640 part, in which the system plays the aural flow, which concatenates the web pages through self-activating links.

FIG. 7 is a representation 660 of an accelerometer-based shake gesture to interact with an aural flow. In some implementations, the aural flow can be interacted with and altered through vocal and/or tactile user actions and/or a locational value of the personal computing device. In such implementations, activating a microphone will temporarily stop the system output and activate the “listening mode.” During this pause, the system will wait for a command. If the microphone button is released with no command having been uttered, the system will simply resume its output. If a command was uttered and understood by the system, the system will react accordingly. Shaking the personal computing device and utilizing the accelerometer to activate the listening mode works similarly to activating the microphone.
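The listening mode described above behaves like a small state machine. The following sketch assumes three states and treats a button press and a shake gesture as the same activation event; the `ListeningMode` class and its state names are hypothetical.

```python
class ListeningMode:
    """Pause output while listening; resume if nothing was said."""

    def __init__(self):
        self.state = "playing"
        self._heard = None

    def activate(self):
        # Triggered by holding the microphone button or by a shake gesture.
        self.state = "listening"
        self._heard = None

    def hear(self, command):
        # Only commands uttered during listening mode are recorded.
        if self.state == "listening":
            self._heard = command

    def release(self):
        """End listening mode and return the command, if any."""
        command, self._heard = self._heard, None
        # No command: simply resume output. Otherwise hand off for handling.
        self.state = "playing" if command is None else "handling"
        return command


mode = ListeningMode()
mode.activate()
print(mode.release(), mode.state)  # nothing uttered, so output resumes
```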

The locational input utilizes a geographical positioning system component that is typical of many personal computing devices. However, the locational input functionality differs from the input of direct user actions. Locational input is typically configured by the user to respond in certain ways to locational values. For example, a user could configure the system such that the content, as presented upon arriving at the user's place of employment, consists of the latest topics on the company's intranet. As another example, a user could request that the content be based upon the user's geographic position.

FIG. 8 is a block diagram of a personal computing device 700 capable of implementing a portion or all of the described technology. The example of one such type of personal computing device 700 shows a block diagram of a programmable processing system (system) 700 suitable for implementing apparatus or performing methods of various aspects of the subject matter described in this specification. The system 700 includes a processor 710, a random access memory (RAM) 721, a program memory 730 (for example, a writable read-only memory (ROM) such as a flash ROM) and an input/output (I/O) controller 740 (typically endowed with GPS capability) coupled with a bus 750. The system 700 can be preprogrammed, in ROM, for example, or it can be programmed (and reprogrammed) by loading a program from another source (for example, downloaded from an application site, or another personal computing device).

The I/O controller 740 is operationally connected to I/O interfaces 760. The I/O interfaces 760 receive and transmit data (e.g., stills, pictures, movies, and animations for importing into a composition) in analog or digital form over communication links such as a serial link, a local area network, a wireless link, a parallel link, and a cellular link, and also receive touch and shake inputs, geographic locational input, and the like.

TABLE 1 Sample system navigation and commands (primitives)

| Command | System response | System Action |
|---|---|---|
| “What's new?” | “Recent stories in (topic) news” | Begin default full flow |
| “Anything else (like this)?”, “More like this” | “Related stories” | Go to related stories |
| “Next” or “Skip” | “Next story” | Go to next story |
| “Previous” or “Back” | “Previous story” | Go to previous story in user history |
| “Stop” or “Pause” | Earcon | Pauses story |
| “Resume”, “Continue” or “Play” | “Resuming (headline)” | Resumes story |
| “Listen to”, “Go to”, “Switch to” or “Change to” | “Switching to (topic) news” | Switch to selected topic |
| “Forward” or “Rewind” | Title of next section is read | Move between sections within a story |
| “Restart” or “Start over” | “Restarting (reads headline)” | Restarts story |
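The command vocabulary of Table 1 lends itself to a simple lookup from a recognized utterance to a system action. The following is a minimal sketch under stated assumptions: the action identifiers (for example "next_story"), the function parse_command, and the split of “Forward” and “Rewind” into next-section and previous-section actions are hypothetical names and choices made for illustration, and speech recognition itself is assumed to have already produced text.

```python
from typing import Optional, Tuple

# Phrases from Table 1 mapped to hypothetical action identifiers.
COMMANDS = {
    "what's new": "begin_default_full_flow",
    "anything else": "related_stories",
    "anything else like this": "related_stories",
    "more like this": "related_stories",
    "next": "next_story",
    "skip": "next_story",
    "previous": "previous_story",
    "back": "previous_story",
    "stop": "pause_story",
    "pause": "pause_story",
    "resume": "resume_story",
    "continue": "resume_story",
    "play": "resume_story",
    "forward": "next_section",       # assumed direction
    "rewind": "previous_section",    # assumed direction
    "restart": "restart_story",
    "start over": "restart_story",
}

# Commands that are followed by a topic name, e.g. "switch to sports".
TOPIC_PREFIXES = ("listen to ", "go to ", "switch to ", "change to ")


def parse_command(utterance: str) -> Tuple[Optional[str], Optional[str]]:
    """Map recognized text to (action, topic); (None, None) if not understood."""
    # Normalize typographic apostrophes, case, whitespace, trailing punctuation.
    text = " ".join(utterance.replace("\u2019", "'").lower().split())
    text = text.rstrip("?.!")
    if text in COMMANDS:
        return COMMANDS[text], None
    for prefix in TOPIC_PREFIXES:
        if text.startswith(prefix) and len(text) > len(prefix):
            return "switch_topic", text[len(prefix):]
    return None, None
```

For instance, parse_command("Switch to sports") yields the action "switch_topic" with topic "sports", while an unrecognized utterance yields (None, None) so that the caller can decide how to respond.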

TABLE 2 Example of the different characteristics of aural flow types

| Flow | Characteristics | Time | Advantages | Disadvantages |
|---|---|---|---|---|
| Group | A selected group of topics | 5 min | Decide the category from the outset | Interact every time to select a different category |
| Full | All groups of topics | Longer period of time - 30 min. | Less interaction | Difficulty building mental model |
| Deep | All groups of topics + semantic associations | Longer period of time - 1 hr. | In-depth coverage of content | Difficulty building mental model |
| Light | Agile overview of each topic (default dialogue act) | Shorter period of time | More stories in less time (agile overview) | Details of each topic will not be played |
| Rich | Extensive coverage of each topic (all dialogue acts) | Longer period of time | Extensive coverage | Time-consuming and constraining |
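As an illustration of how the durations in Table 2 might inform a choice of flow, the sketch below picks the longest of the Group, Full, and Deep flows that fits within the time a user has available. The selection rule, the names FlowType and choose_flow, and the fallback to the shortest flow are assumptions made for this example; the specification does not state how a flow type is chosen, and the Light and Rich types are left out because Table 2 gives no numeric duration for them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FlowType:
    name: str
    characteristics: str
    nominal_minutes: int  # duration listed in Table 2


# Flow types for which Table 2 gives a numeric duration.
FLOWS: List[FlowType] = [
    FlowType("group", "a selected group of topics", 5),
    FlowType("full", "all groups of topics", 30),
    FlowType("deep", "all groups of topics + semantic associations", 60),
]


def choose_flow(available_minutes: float) -> FlowType:
    """Pick the longest flow that fits; fall back to the shortest (assumed rule)."""
    fitting = [f for f in FLOWS if f.nominal_minutes <= available_minutes]
    if not fitting:
        return min(FLOWS, key=lambda f: f.nominal_minutes)
    return max(fitting, key=lambda f: f.nominal_minutes)
```

Under this assumed rule, a user with a 10-minute walk would be offered the group flow, since the full flow's nominal 30 minutes would not fit.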

Embodiments of the subject matter and the operations described in this specification can be implemented as a method, in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on computer storage medium for execution by, or to control the operation of, data processing apparatus. Alternatively or in addition, the program instructions can be encoded on an artificially-generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal that is generated to encode information for transmission to suitable receiver apparatus for execution by a data processing apparatus. A computer storage medium can be, or be included in, a computer-readable storage device, a computer-readable storage substrate, a random or serial access memory array or device, or a combination of one or more of them. Moreover, while a computer storage medium is not a propagated signal, a computer storage medium can be a source or destination of computer program instructions encoded in an artificially-generated propagated signal. The computer storage medium can also be, or be included in, one or more separate physical components or media (e.g., multiple CDs, disks, or other storage devices).

The operations described in this specification can be implemented as operations performed by a data processing apparatus on data stored on one or more computer-readable storage devices or received from other sources.

The term “data processing apparatus” encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, a system on a chip, or multiple ones, or combinations, of the foregoing. The apparatus can include special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit). The apparatus can also include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, a cross-platform runtime environment, a virtual machine, or a combination of one or more of them. The apparatus and execution environment can realize various different computing model infrastructures, such as web services, distributed computing and grid computing infrastructures.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, object, or other unit suitable for use in a computing environment. A computer program may, but need not, correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).

Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device (e.g., a universal serial bus (USB) flash drive), to name just a few. Devices suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

To provide for interaction with a user, embodiments of the subject matter described in this specification can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input. In addition, a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a web browser on a user's client device in response to requests received from the web browser.

Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (“LAN”) and a wide area network (“WAN”), an inter-network (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).

The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, a server transmits data (e.g., an HTML page) to a client device (e.g., for purposes of displaying data to and receiving user input from a user interacting with the client device). Data generated at the client device (e.g., a result of the user interaction) can be received from the client device at the server.

While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any inventions or of what may be claimed, but rather as descriptions of features specific to particular embodiments of particular inventions. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking and parallel processing may be advantageous.

What is claimed is:
 1. A method comprising: generating, by a computer, a model of a website using user interaction primitives to represent hierarchical and hypertextual structures of the website; generating, by the computer, a linear aural flow of content of the website based upon the model and a set of user constraints; audibly presenting, by the computer, the linear aural flow of the content such that the linear aural flow of content is controlled through the use of user supplied primitives, wherein, the linear aural flow can be turned into a dynamic aural flow based upon the user supplied primitives.
 2. The method of claim 1 wherein user supplied primitives comprises a spoken command.
 3. The method of claim 1 wherein the linear aural flow is further based on a ranking of current topics based upon each topic's page hits the website has received.
 4. The method of claim 2 wherein interrupted audibly presented content is bookmarked such that the bookmark ages over a user stated period and is eliminated upon an ending of a user stated period.
 5. The method of claim 1 wherein the set of user constraints is derived from a user's past audio browsing history in conjunction with the device used to perform the past audio browsing.
 6. The method of claim 1 wherein the user supplied primitives are interpreted in context of a user's session.
 7. The method of claim 1 wherein the linear aural flow sequences individual articles into dialogues for audio presentation including a dialog for an article's headline, a dialog for the article's summary, and a dialog for the article's content.
 8. The method of claim 2 wherein a spoken command is a name of a category of content available on the website.
 9. The method of claim 1 wherein the set of user constraints is derived from popularity measures of articles present on the website.
 10. A computer storage medium encoded with a computer program, the program comprising instructions that when executed by a user device cause the user device to perform operations comprising: receiving a model of a website, the model representing hierarchical and hypertextual structures of the website, wherein the model uses user interaction primitives to represent the hierarchical and the hypertextual structures of the website; receiving a set of user derived constraints; generating a linear aural flow of content of the website based upon the model and a set of user derived constraints; audibly presenting the linear aural flow of the content; determining whether a user command indicates a desire for a dynamic aural flow; upon determining that a user command indicates a desire for a dynamic aural flow, audibly presenting a dynamic aural flow.
 11. The method of claim 10, wherein interrupted audibly presented content is bookmarked such that the bookmark ages over a user stated period and is eliminated upon an ending of a user stated period.
 12. The method of claim 10 wherein the set of user constraints is derived in part from a user's past audio browsing history in conjunction with the device used to perform the past audio browsing.
 13. A system comprising: a user device; one or more computers operable to interact with the device; instructions stored on a machine readable storage device for execution by the one or more computers, wherein upon execution the instructions cause the one or more computers to perform the operations of: generate a model of a website, the model representing hierarchical and hypertextual structures of the website through usage of user interaction primitives; generate a linear aural flow of content of the website based upon the model and a set of user constraints; provide instructions to the user device causing the user device to audibly present the linear aural flow of the content; upon receiving input from a user, provide instructions to the user device causing the user device to audibly present a dynamic aural flow of the content.
 14. The system of claim 13, wherein the one or more computers comprise the user device.
 15. The system of claim 13, wherein the linear aural flow is further based on a ranking of current topics based upon each topic's page hits the website has received.
 16. The system of claim 13, wherein the one or more computers comprise a server operable to interact with the device through a data communication network, and the user device is operable to interact with the server as a client.
 17. The system of claim 13, wherein the one or more computers consist of one computer, the user device is a user interface device, and the one computer comprises the user interface device.